TITLE OF THE INVENTION



Selective Document Processing System and Method 



TECHNICAL FIELD 



The present invention is generally related to document processing and, more 
particularly, is related to a selective document processing system and method to 
selectively control the processing of information on documents and the like. 



BACKGROUND OF THE INVENTION

More and more documents are generated using word processors and the like
and are stored on memory devices, such as hard drives, floppy disks, compact disks,
and other mass storage media. Nonetheless, paper and other similar media will
continue to be used far into the future. Consequently, there will continually be a need
to scan the substance portrayed on such media so that such information may be
manipulated on a computer or other like device.

However, the scanning of paper documents to make the content thereon
available in a digital environment may be time consuming and costly. In particular,
one problem is that the processing of various regions of scanned documents may take
a long time, requiring the user to wait for the processing of a whole document.
Oftentimes, a user may only want to access a portion of the text, artwork, or other
region of the scanned document rather than the entire document, such as when
specific paragraphs of text are sought from a document. However, current users are
often forced to wait while scan converter technology analyzes an entire document to
determine the specific types of the various regions that may then be processed by
various processing pipelines, such as optical character recognition pipelines.

SUMMARY OF THE INVENTION

To address the above stated problems, the present invention provides for a
selective document processing system and method. In one embodiment, the selective
document processing system includes a digital document analyzer configured to
determine a number of regions on a digital document and a data type for each of the
regions, the data type for each region being one of a number of predefined data types.
The system also includes a first user interface to display the analyzed digital document
and to allow the user to perform various functions relative to the displayed digital
document including selecting desired regions, deleting regions, etc. The system also
includes a selection interface activated from the first user interface for identifying at
least one of the predefined data types that are displayed on the first user interface for
viewing and further processing in predetermined processing pipelines.

The present invention can also be viewed as providing a method for
controlling document region analysis. In this regard, the method can be broadly
summarized by the following steps: analyzing a digital document to determine a
number of regions thereon and a data type for each of the regions, the data type for
each region being one of a number of predefined data types; and identifying at least
one of the predefined data types for further processing.

The present invention includes various advantages, such as providing the user
with more efficient document processing, because unwanted data types can be
excluded simply by selecting only the desired data types in the selection interface
rather than by manually deleting each unwanted region. This is especially the case
for mass document processing in which only specific data types are sought from a
number of documents that are consecutively processed. Also, the user is spared the
difficulty of viewing a digital document on the first user interface that may be
cluttered with unwanted data types. The present invention is also simple in design,
user friendly, robust, reliable, and efficient in operation, and easily implemented for
mass commercial production.

Other features and advantages of the present invention will become apparent to
one with skill in the art upon examination of the following drawings and detailed
description. It is intended that all such additional features and advantages be included
herein within the scope of the present invention.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

The invention can be better understood with reference to the following
drawings. The components in the drawings are not necessarily to scale, emphasis
instead being placed upon clearly illustrating the principles of the present invention.
Moreover, in the drawings, like reference numerals designate corresponding parts
throughout the several views.

FIG. 1 is a block diagram of a selective document processing system according
to an embodiment of the present invention;

FIG. 2 is a drawing of a first user interface shown on a display screen of the
selective document processing system of FIG. 1;

FIG. 3 is a drawing of a selection interface shown on the display screen of the
selective document processing system of FIG. 1; and

FIG. 4 is a flow chart of selective processing logic stored and executed by the
selective document processing system of FIG. 1.

DETAILED DESCRIPTION

Referring to FIG. 1, shown is a block diagram of a selective document processing
system 100 according to an embodiment of the present invention. The selective
document processing system 100 includes a computer system 103 which comprises a
processor 106 and a volatile/nonvolatile memory 113, both of which are coupled to a
local interface 116. The local interface 116 comprises, for example, a data bus and a
control bus, or other like structure. The computer system 103 further comprises a
video interface 119, a number of input interfaces 123, a modem 126, a number of
output interfaces 129, and a mobile data storage device 133, all of which are also
coupled to the local interface 116. The memory 113 may include, for example, a
random access memory (RAM), a read only memory (ROM), a hard drive, and other
like devices, or any combination of these devices. Note that the term volatile refers to
memory devices that generally lose data stored therein upon loss of power, and
nonvolatile refers to memory devices that do not lose data upon loss of power.

The selective document processing system 100 also includes a display device
136 that is coupled to the local interface 116 via the video interface 119. The display
device may be, for example, a cathode ray tube (CRT), a liquid crystal display (LCD),
or other similar display device. The system 100 also includes several input devices,
namely, a keyboard 139, a mouse 143, a microphone 146, and a scanner 149 that are
all coupled to the local interface 116 via the various input interfaces 123. In addition,
the modem 126 is coupled to an external network 153, thus allowing the computer
system to send and receive data via the external network 153. The external network
153 may be, for example, the Internet, a local area network (LAN), a wide area network
(WAN), or other similar network.

The selective document processing system 100 may further include audio
speakers 156, a printer 159, or other output devices that are coupled to the local
interface 116 via the output interfaces 129. The mobile data storage device 133 may
be one of several such devices that allow storage of data on a mobile platform such as
a floppy disk drive, compact disc drive, mobile hard drive, mobile fixed memory, or
other similar data storage device.

The selective document processing system 100 also includes selective
processing logic 170 that is generally stored on the memory 113 along with data 176.
In one embodiment of the present invention, the memory 113 comprises a
combination of RAM, ROM, and a hard drive, although other combinations may be
used. In one embodiment, the selective processing logic 170 is software that is stored
on the hard drive and the data 176 is also stored on the hard drive. When the selective
document processing system 100 is operational, pertinent portions of the selective
processing logic 170 are loaded into the RAM and are executed by the processor 106.
During operation of the selective document processing system 100, the selective
processing logic 170 may access pertinent portions of the data 176 stored on the hard
drive, loading them into the RAM for various purposes. For example, the data 176
may comprise a digital document such as a bit map image of a scanned document
received from the scanner 149. The data 176 may also be accessed via the mobile
data storage 133 or the external network 153.

The display device 136 is employed to display any one of a number of user
interfaces 181 that are viewed by the user. The user may also interface with the
computer system 103 via the input devices such as the keyboard 139, mouse 143,
microphone 146, or other input devices. The user receives audio output from the
audio speakers 156, and the computer system 103 may print out various documents
on the printer 159.

Note that although the above implementation of the present invention is
discussed in terms of a processor circuit and software, it is understood that other
embodiments of the present invention include a dedicated logical circuit that
accomplishes the functionality of the selective processing logic 170, or a combination
circuit which includes a processor circuit with software and specific dedicated
circuits. It is understood that all such permutations of various implementations are
included herein.

The selective document processing system 100 advantageously provides a
flexible system for processing digital documents received via the scanner 149,
external network 153, mobile data storage 133, or stored in the memory 113. In
processing the digital documents, the system 100 identifies one or more regions on the
digital document that comprise uniform information such as specific text, artwork,
or a photo. Thereafter, the regions are applied to appropriate processing pipelines
according to specific criteria discussed later in detail. The processing pipelines may
comprise, for example, optical character recognition or photo processing algorithms.
The resulting processed regions are then recombined and provided to a desired
destination application that may be, for example, a word processor or other similar
application.

Referring then to FIG. 2, shown is a first user interface 181a. The first user
interface 181a includes a menu bar 203 from which a number of pulldown menus 206
may be accessed. The pulldown menus 206 include File, Edit, View, Settings, Select,
Clear, and Help menus, although others may be employed. Each pulldown menu 206
may be accessed by positioning a mouse pointer 209 thereon and "clicking" the mouse
143 (FIG. 1). The term "clicking" the mouse 143 refers to the action of pressing an
appropriate button on the mouse 143, thereby providing an input signal to the
computer system 103. The combined action of positioning the mouse pointer 209
on an item on the user interface 181a and clicking the mouse 143 is generally called
"clicking on" that item. Note the pulldown menus 206 may be accessed by pressing
appropriate buttons on the keyboard 139 (FIG. 1) as well, although the use
of the mouse 143 is generally preferred. In addition, predetermined voice commands
may be employed to replace the functions of the mouse 143 and keyboard 139.
Although there may be several options for the user to pursue in
each of the pulldown menus 206, only those pertinent to the present invention are
discussed herein.

The first user interface 181a also includes a destination application indicator
213. The destination application indicator 213 includes a picklist (not shown) of a
number of destination applications that can be accessed by clicking on a picklist
button 216 associated with the destination application indicator 213. The destination
applications are those software and/or hardware applications with which the selective
document processing system 100 interfaces. That is to say, these software and/or
hardware applications are the applications to which the information in each of the
previously identified regions is applied. These may include a word processor, a photo
processor, a drawing package, an email package, a publishing package, a document
creator, a forms package, a web page maker, databases, operating system clipboards,
or other applications. Note that the destination application may also include storage
as a file, printing on a printer, transmission by facsimile, or printing via a copier as
well. To give a specific example, the text in a region of an identified digital document
may be applied to a word processor or the like.
The first user interface 181a also displays a digital document 219 that includes
at least one region 223. The digital document 219 displayed is that which is identified
by the user and is thereafter analyzed and displayed accordingly. The total number of
regions 223 shown in FIG. 2 is five, although a greater or lesser number of regions
223 may exist. Note that the regions 223 are numbered from one to five, although in
the preferred embodiment, the actual text, artwork, or photos in each region are shown.
The regions 223 are identified by performing a document analysis on a specified
digital document received from the scanner 149, the external network 153, the mobile
data storage 133, or the memory 113. The document analysis identifies the regions
223 by examining the information on the digital document 219 and detecting specific
data types thereon. The regions 223 are formed encompassing each area in which the
information is of a single data type. There are several data types that can be identified
such as, for example, true color photos, grayscale photos, color logos, black & white
logos, tables, spot color art, text, page headers, page footers, titles, indexes, tables of
contents, and other data types.
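
The relationship between regions and data types described above can be sketched as a small data model. This is only an illustrative sketch, not the analyzer itself: the names `DataType`, `Region`, `bounds`, and `regions_by_type` are hypothetical, and the rectangular bounds are an assumption, since the description does not say how a region's extent is represented.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Dict, Iterable, List, Tuple


class DataType(Enum):
    """Data types named in the description (the list is not exhaustive)."""
    TRUE_COLOR_PHOTO = auto()
    GRAYSCALE_PHOTO = auto()
    COLOR_LOGO = auto()
    BLACK_AND_WHITE_LOGO = auto()
    TABLE = auto()
    SPOT_COLOR_ART = auto()
    TEXT = auto()
    PAGE_HEADER = auto()
    PAGE_FOOTER = auto()
    TITLE = auto()
    INDEX = auto()
    TABLE_OF_CONTENTS = auto()


@dataclass(frozen=True)
class Region:
    """One area of the digital document whose information is of a single data type."""
    number: int                        # label of the region, as drawn in FIG. 2
    data_type: DataType                # exactly one data type per region
    bounds: Tuple[int, int, int, int]  # assumed (left, top, right, bottom) in pixels


def regions_by_type(regions: Iterable[Region]) -> Dict[DataType, List[Region]]:
    """Group analyzed regions by data type, preserving their original order."""
    grouped: Dict[DataType, List[Region]] = {}
    for region in regions:
        grouped.setdefault(region.data_type, []).append(region)
    return grouped
```

Grouping by data type is one convenient way to hand each kind of region to its own processing pipeline later on.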

The first user interface 181a also includes a region selection button 226 that
controls the access to the regions 223. When depressed, the region selection button
226 allows the user to highlight or choose any one or more of the regions 223 by
clicking thereon. A highlighted region may be, for example, deleted or altered by the
user using the keyboard 139 or the mouse 143. If a region is double clicked, then that
region is immediately processed by the processing pipelines as stated previously. The
user may also click on the magnify button 229 or the demagnify button 233 in order to
zoom in and out on the digital document 219 or a particular region 223 thereon.

The first user interface 181a also includes an accept button 236, a cancel
button 239, and a help button 243. The accept button 236 allows the user to apply all
highlighted regions 223 to the appropriate processing pipelines. Note the same can be
done for a specific region 223 by double clicking on that region 223. When the user
clicks on the cancel button 239, the function of the selective document processing
system 100 ceases. Clicking on the help button 243 brings forth a help menu to provide
aid in operating the selective document processing system 100.

Turning then to FIG. 3, shown is a selection interface 181b according to
another embodiment of the present invention. The selection interface 181b is
displayed on the display device 136 by clicking a menu item on the settings menu
called "page elements". The selection interface 181b provides a list of the various
data types 283 that can be identified by the selective document processing system 100.
Beside each of the data types 283 is a selection indicator 286. The selection indicator
286 may also be considered a toggle mechanism. As shown, the data types 283
comprise true color photo, grayscale photo, color logo, black and white logo, table,
spot color art, text, page header, page footer, titles, index, and table of contents. Note
that this list is not intended to be all-inclusive as other nonlisted data types may be
included as well. The selection indicator 286 shows a check mark when the
particular data type is selected and is blank when the particular data type is not
selected. The user can toggle between the selected and not selected states by clicking
on the appropriate selection indicator 286 with the mouse 143 (FIG. 1).
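
The toggle behavior of the selection indicators 286 can be sketched as follows. This is a minimal sketch under stated assumptions: the class name `SelectionInterface`, its methods, and the text rendering with `[x]` and `[ ]` are hypothetical, and starting with every data type selected is an assumption, since the description does not state the initial state of the indicators.

```python
# Data types 283 as listed for the selection interface 181b (FIG. 3).
DATA_TYPES = (
    "true color photo", "grayscale photo", "color logo",
    "black and white logo", "table", "spot color art", "text",
    "page header", "page footer", "titles", "index", "table of contents",
)


class SelectionInterface:
    """Tracks which data types are currently selected (hypothetical model)."""

    def __init__(self, selected=None):
        # Assumption: with no stored setting, every data type starts selected.
        self._selected = set(DATA_TYPES if selected is None else selected)

    def is_selected(self, data_type):
        return data_type in self._selected

    def toggle(self, data_type):
        """Flip one selection indicator and return its new state."""
        if data_type not in DATA_TYPES:
            raise ValueError(f"unknown data type: {data_type!r}")
        if data_type in self._selected:
            self._selected.discard(data_type)
        else:
            self._selected.add(data_type)
        return self.is_selected(data_type)

    def render(self):
        """One line per data type: a check mark when selected, blank otherwise."""
        return [
            f"[{'x' if self.is_selected(t) else ' '}] {t}" for t in DATA_TYPES
        ]
```

Each call to `toggle` corresponds to one click on a selection indicator, and `render` stands in for the check mark or blank shown beside each data type.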

The selection interface 181b controls the specific data types 283 that appear in
the digital document 219 in the first user interface 181a in that only selected data
types 283 appear. In addition, only those selected data types 283 undergo further
processing in the processing pipelines and are ultimately applied to a destination
application. Thus, the selection interface 181b provides a distinct advantage in that a
user can focus on predetermined data types 283 when processing documents on a
mass scale by selecting only certain desired data types 283 in the selection interface
181b. The user can thereby minimize the time spent manually manipulating the
digital documents 219 by, for example, selecting multiple regions of the desired data
type 283 to be applied individually to the processing pipelines, or collectively
applying multiple regions 223 by pressing the accept button 236 (FIG. 2). According to
the present invention, a default setting for the selection interface 181b is stored in the
memory 113 (FIG. 1) and the selection interface 181b features this setting at startup of
the selective document processing system 100.
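
The effect of the selection interface 181b on what is displayed can be sketched as a simple filter over the analyzed regions. The names `Region`, `DEFAULT_SELECTION`, and `visible_regions` are hypothetical, and the contents of the default setting are invented for illustration; the description says only that a default is stored in memory and applied at startup.

```python
from typing import NamedTuple


class Region(NamedTuple):
    number: int      # region label, as in FIG. 2
    data_type: str   # one of the data types listed in the selection interface


# Hypothetical default setting, standing in for the one stored in memory.
DEFAULT_SELECTION = frozenset({"text", "table"})


def visible_regions(regions, selected_types=DEFAULT_SELECTION):
    """Keep only the regions whose data type is currently selected."""
    return [r for r in regions if r.data_type in selected_types]


page = [
    Region(1, "page header"),
    Region(2, "text"),
    Region(3, "true color photo"),
    Region(4, "table"),
    Region(5, "page footer"),
]
shown = visible_regions(page)  # regions 2 and 4 under the assumed default
```

Because the same filter decides both what is displayed and what is later processed, a user working through many documents sets the desired data types once rather than deleting regions on each page.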

Reference is now made to FIG. 4, in which a flow chart of the selective
processing logic 170 is shown. Beginning with block 303, the digital document 219
(FIG. 2) that is to be processed is identified. This digital document 219 may be
identified simply by scanning the document with the scanner 149, which automatically
triggers the activation of the selective document processing system 100 for the
document scanned. The digital document 219 may also be chosen using a picklist or
"open file" option from the file menu (FIG. 2). Once the digital document 219 is
identified, the logic 170 progresses to block 306 in which the digital document 219 is
analyzed, the various data types 283 thereon are identified, and the various regions
223 (FIG. 2) formed by the data types 283 are isolated. Thereafter, the logic 170
progresses to block 309 in which the digital document 219 is displayed including the
regions 223 on the first user interface 181a (FIG. 2). As previously mentioned, only
the regions 223 that have been selected based on the selection interface 181b (FIG. 3)
are displayed on the first user interface 181a.

Next, in block 313, the logic 170 determines whether the selection interface
181b has been selected by the user from the settings menu. If the selection interface
181b is selected, then the logic 170 moves to block 316 in which the selection
interface 181b is displayed on the display device 136 (FIG. 1). Thereafter the logic
170 progresses to block 319 in which the various data types 283 are selected or
deselected based upon the user manipulation of the selection indicators 286 (FIG. 3)
as was previously discussed.

Docket No. 10990419

However, if in block 313 the selection interface 181b has not been selected by
the user from the settings menu, then the logic 170 progresses to block 323 in which it
is determined whether the accept button 236 has been depressed (assuming desired
regions 223 have been highlighted by clicking thereon), or whether the user has
double-clicked on a particular region. If not, the logic 170 reverts to block 313.
If so, then the logic 170 progresses to block 326. In block 326, the appropriate
processing pipelines are identified based upon the selected data types in the selection
interface 181b and the selected destination application identified in the destination
application indicator 213. The pipelines may include, for example, optical character
recognition algorithms, raster to vector conversions, processing for color photos,
processing for grayscale photos, and processing for tables. Thereafter, the logic 170
progresses to block 329 where the selected regions 223 are applied to the identified
processing pipelines and processed accordingly. The results are then combined and
provided to the identified destination application for further manipulation by the user.
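
Blocks 326 and 329 can be sketched as a lookup from data type to pipeline followed by a pass over the highlighted regions. This is a simplified sketch, not the logic 170 itself: the pipeline functions are placeholders that only tag their input, all names are hypothetical, and the lookup here is keyed on data type alone, whereas the description also takes the destination application into account when identifying pipelines.

```python
from typing import Callable, Dict, Iterable, List, NamedTuple, Set


class Region(NamedTuple):
    number: int
    data_type: str
    content: str  # stand-in for the region's image data


# Placeholder pipelines; real ones would run OCR, photo processing, and so on.
def ocr_pipeline(region: Region) -> str:
    return f"[text] {region.content}"


def photo_pipeline(region: Region) -> str:
    return f"[photo] {region.content}"


def table_pipeline(region: Region) -> str:
    return f"[table] {region.content}"


PIPELINES: Dict[str, Callable[[Region], str]] = {
    "text": ocr_pipeline,
    "true color photo": photo_pipeline,
    "grayscale photo": photo_pipeline,
    "table": table_pipeline,
}


def process_regions(
    regions: Iterable[Region],
    highlighted: Set[int],
    selected_types: Set[str],
    destination: Callable[[List[str]], None],
) -> List[str]:
    """Apply highlighted regions of selected types to their pipelines,
    then hand the combined results to the destination application."""
    results = []
    for region in regions:
        if region.number not in highlighted:
            continue  # the user did not highlight this region
        if region.data_type not in selected_types:
            continue  # data type deselected in the selection interface
        pipeline = PIPELINES.get(region.data_type)
        if pipeline is None:
            continue  # no pipeline registered for this type in the sketch
        results.append(pipeline(region))
    destination(results)
    return results
```

Passing the destination as a callable keeps the sketch independent of any particular word processor or photo editor; a caller could supply a function that appends to a file or copies to a clipboard.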

The present invention provides several distinct advantages to the user in
analyzing documents. For example, the present invention provides a user with faster
and more efficient document processing, because unwanted data types need not be
examined or manually eliminated; the user simply selects only the desired data types
in the selection interface 181b. This is especially the case for mass document
processing in which only specific data types are sought from a number of documents
that are consecutively processed. Also, the user is spared the difficulty of viewing a
digital document on the first user interface 181a that may be cluttered with unwanted
data types. The present invention also allows the user to prevent the creation of data
types that the destination application will not process such as, for example, unwanted
"tables" which may be sent to a photo editor and stored as a photo and not as tables.



In addition, the flow chart of FIG. 4 shows the architecture, functionality, and
operation of a possible implementation of the selective processing logic 170 (FIG. 1).
In this regard, each block represents a module, segment, or portion of code, which
comprises one or more executable instructions for implementing the specified logical
function(s). It should also be noted that in some alternative implementations, the
functions noted in the blocks may occur out of the order noted in FIG. 4. For
example, two blocks shown in succession in FIG. 4 may in fact be executed
substantially concurrently or the blocks may sometimes be executed in the reverse
order, depending upon the functionality involved, as will be further clarified
hereinbelow.

The selective processing logic 170, which preferably comprises an ordered
listing of executable instructions for implementing logical functions, can be embodied
in any computer-readable medium for use by or in connection with an instruction
execution system, apparatus, or device, such as a computer-based system,
processor-containing system, or other system that can fetch the instructions from the
instruction execution system, apparatus, or device and execute the instructions. In the
context of this document, a "computer-readable medium" can be any means that can
contain, store, communicate, propagate, or transport the program for use by or in
connection with the instruction execution system, apparatus, or device. The
computer-readable medium can be, for example but not limited to, an electronic,
magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus,
device, or propagation medium. More specific examples (a nonexhaustive list) of the
computer-readable medium would include the following: an electrical connection
(electronic) having one or more wires, a portable computer diskette (magnetic), a
random access memory (RAM) (electronic), a read-only memory (ROM) (electronic),
an erasable programmable read-only memory (EPROM or Flash memory)
(electronic), an optical fiber (optical), and a portable compact disc read-only memory
(CDROM) (optical). Note that the computer-readable medium could even be paper or
another suitable medium upon which the program is printed, as the program can be
electronically captured, via for instance optical scanning of the paper or other medium,
then compiled, interpreted or otherwise processed in a suitable manner if necessary,
and then stored in a computer memory.

Many variations and modifications may be made to the above-described
embodiment(s) of the invention without departing substantially from the spirit and
principles of the invention. All such modifications and variations are intended to be
included herein within the scope of the present invention.